Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 10000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.5 MiB |
| Average record size in memory | 257.8 B |
Variable types
| Numeric | 6 |
|---|---|
| Categorical | 6 |
Alerts
| number_products is highly correlated with exited | High correlation |
|---|---|
| exited is highly correlated with number_products | High correlation |
| df_index is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| tenure has 413 (4.1%) zeros | Zeros |
| balance has 3617 (36.2%) zeros | Zeros |
Reproduction
| Analysis started | 2022-03-23 12:15:18.116024 |
|---|---|
| Analysis finished | 2022-03-23 12:23:46.231661 |
| Duration | 8 minutes and 28.12 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
df_index
Real number (ℝ≥0)
| Distinct | 10000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5000.5 |
| Minimum | 1 |
| Maximum | 10000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 500.95 |
| Q1 | 2500.75 |
| median | 5000.5 |
| Q3 | 7500.25 |
| 95-th percentile | 9500.05 |
| Maximum | 10000 |
| Range | 9999 |
| Interquartile range (IQR) | 4999.5 |
Descriptive statistics
| Standard deviation | 2886.89568 |
|---|---|
| Coefficient of variation (CV) | 0.5773214038 |
| Kurtosis | -1.2 |
| Mean | 5000.5 |
| Median Absolute Deviation (MAD) | 2500 |
| Skewness | 0 |
| Sum | 50005000 |
| Variance | 8334166.667 |
| Monotonicity | Strictly increasing |
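The df_index figures above are what one would expect if the column simply holds the integers 1 through 10000. The following is a minimal sketch under that assumption (the report does not state it outright). It also assumes linearly interpolated percentiles and the sample variance with an n - 1 denominator, which is a guess at how the report computed them. The helper `percentile` is a hypothetical name introduced for illustration.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of pre-sorted data."""
    position = (len(sorted_values) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: df_index is exactly 1, 2, ..., 10000.
values = list(range(1, 10001))

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
q1 = percentile(values, 25)
q3 = percentile(values, 75)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

summary = {
    "Mean": mean,
    "Median": median,
    "Q1": q1,
    "Q3": q3,
    "IQR": q3 - q1,
    "Standard deviation": std_dev,
    "CV": std_dev / mean,
    "MAD": mad,
    "Sum": sum(values),
    "Variance": variance,
}
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions the computed values agree with the table to the precision shown, which supports reading df_index as a plain row counter rather than a feature.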
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 6671 | 1 | < 0.1% |
| 6664 | 1 | < 0.1% |
| 6665 | 1 | < 0.1% |
| 6666 | 1 | < 0.1% |
| 6667 | 1 | < 0.1% |
| 6668 | 1 | < 0.1% |
| 6669 | 1 | < 0.1% |
| 6670 | 1 | < 0.1% |
| 6672 | 1 | < 0.1% |
| Other values (9990) | 9990 | 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10000 | 1 | < 0.1% |
| 9999 | 1 | < 0.1% |
| 9998 | 1 | < 0.1% |
| 9997 | 1 | < 0.1% |
| 9996 | 1 | < 0.1% |
| 9995 | 1 | < 0.1% |
| 9994 | 1 | < 0.1% |
| 9993 | 1 | < 0.1% |
| 9992 | 1 | < 0.1% |
| 9991 | 1 | < 0.1% |
credit_rating
Real number (ℝ≥0)
| Distinct | 460 |
|---|---|
| Distinct (%) | 4.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 650.5288 |
| Minimum | 350 |
| Maximum | 850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 350 |
|---|---|
| 5-th percentile | 489 |
| Q1 | 584 |
| median | 652 |
| Q3 | 718 |
| 95-th percentile | 812 |
| Maximum | 850 |
| Range | 500 |
| Interquartile range (IQR) | 134 |
Descriptive statistics
| Standard deviation | 96.65329874 |
|---|---|
| Coefficient of variation (CV) | 0.14857651 |
| Kurtosis | -0.4257256848 |
| Mean | 650.5288 |
| Median Absolute Deviation (MAD) | 67 |
| Skewness | -0.0716066082 |
| Sum | 6505288 |
| Variance | 9341.860157 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 850 | 233 | 2.3% |
| 678 | 63 | 0.6% |
| 655 | 54 | 0.5% |
| 705 | 53 | 0.5% |
| 667 | 53 | 0.5% |
| 684 | 52 | 0.5% |
| 670 | 50 | 0.5% |
| 651 | 50 | 0.5% |
| 683 | 48 | 0.5% |
| 652 | 48 | 0.5% |
| Other values (450) | 9296 | 93.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 350 | 5 | 0.1% |
| 351 | 1 | < 0.1% |
| 358 | 1 | < 0.1% |
| 359 | 1 | < 0.1% |
| 363 | 1 | < 0.1% |
| 365 | 1 | < 0.1% |
| 367 | 1 | < 0.1% |
| 373 | 1 | < 0.1% |
| 376 | 2 | < 0.1% |
| 382 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 850 | 233 | 2.3% |
| 849 | 8 | 0.1% |
| 848 | 5 | 0.1% |
| 847 | 6 | 0.1% |
| 846 | 5 | 0.1% |
| 845 | 6 | 0.1% |
| 844 | 7 | 0.1% |
| 843 | 2 | < 0.1% |
| 842 | 7 | 0.1% |
| 841 | 12 | 0.1% |
country
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 615.4 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.0032 |
| Min length | 5 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | France |
|---|---|
| 2nd row | Spain |
| 3rd row | France |
| 4th row | France |
| 5th row | Spain |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| France | 5014 | 50.1% |
| Germany | 2509 | 25.1% |
| Spain | 2477 | 24.8% |
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 604.7 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.9086 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Female |
| 4th row | Female |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Male | 5457 | 54.6% |
| Female | 4543 | 45.4% |
age
Real number (ℝ≥0)
| Distinct | 150 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 74.4136 |
| Minimum | 0 |
| Maximum | 149 |
| Zeros | 62 |
| Zeros (%) | 0.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 37 |
| median | 74 |
| Q3 | 112 |
| 95-th percentile | 142 |
| Maximum | 149 |
| Range | 149 |
| Interquartile range (IQR) | 75 |
Descriptive statistics
| Standard deviation | 43.50270777 |
|---|---|
| Coefficient of variation (CV) | 0.5846069505 |
| Kurtosis | -1.208103612 |
| Mean | 74.4136 |
| Median Absolute Deviation (MAD) | 38 |
| Skewness | 0.01045541717 |
| Sum | 744136 |
| Variance | 1892.485584 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 17 | 83 | 0.8% |
| 131 | 83 | 0.8% |
| 67 | 83 | 0.8% |
| 113 | 83 | 0.8% |
| 29 | 83 | 0.8% |
| 116 | 82 | 0.8% |
| 24 | 81 | 0.8% |
| 1 | 81 | 0.8% |
| 142 | 81 | 0.8% |
| 136 | 80 | 0.8% |
| Other values (140) | 9180 | 91.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 62 | 0.6% |
| 1 | 81 | 0.8% |
| 2 | 74 | 0.7% |
| 3 | 46 | 0.5% |
| 4 | 63 | 0.6% |
| 5 | 68 | 0.7% |
| 6 | 60 | 0.6% |
| 7 | 77 | 0.8% |
| 8 | 67 | 0.7% |
| 9 | 69 | 0.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 149 | 76 | 0.8% |
| 148 | 68 | 0.7% |
| 147 | 65 | 0.7% |
| 146 | 74 | 0.7% |
| 145 | 60 | 0.6% |
| 144 | 58 | 0.6% |
| 143 | 68 | 0.7% |
| 142 | 81 | 0.8% |
| 141 | 74 | 0.7% |
| 140 | 61 | 0.6% |
tenure
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.0128 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 413 |
| Zeros (%) | 4.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 9 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.892174377 |
|---|---|
| Coefficient of variation (CV) | 0.5769578633 |
| Kurtosis | -1.165225227 |
| Mean | 5.0128 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.01099145798 |
| Sum | 50128 |
| Variance | 8.364672627 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=11)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1048 | 10.5% |
| 1 | 1035 | 10.4% |
| 7 | 1028 | 10.3% |
| 8 | 1025 | 10.3% |
| 5 | 1012 | 10.1% |
| 3 | 1009 | 10.1% |
| 4 | 989 | 9.9% |
| 9 | 984 | 9.8% |
| 6 | 967 | 9.7% |
| 10 | 490 | 4.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 413 | 4.1% |
| 1 | 1035 | 10.4% |
| 2 | 1048 | 10.5% |
| 3 | 1009 | 10.1% |
| 4 | 989 | 9.9% |
| 5 | 1012 | 10.1% |
| 6 | 967 | 9.7% |
| 7 | 1028 | 10.3% |
| 8 | 1025 | 10.3% |
| 9 | 984 | 9.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 490 | 4.9% |
| 9 | 984 | 9.8% |
| 8 | 1025 | 10.3% |
| 7 | 1028 | 10.3% |
| 6 | 967 | 9.7% |
| 5 | 1012 | 10.1% |
| 4 | 989 | 9.9% |
| 3 | 1009 | 10.1% |
| 2 | 1048 | 10.5% |
| 1 | 1035 | 10.4% |
balance
Real number (ℝ≥0)
| Distinct | 6382 |
|---|---|
| Distinct (%) | 63.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 76485.88929 |
| Minimum | 0 |
| Maximum | 250898.09 |
| Zeros | 3617 |
| Zeros (%) | 36.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 97198.54 |
| Q3 | 127644.24 |
| 95-th percentile | 162711.669 |
| Maximum | 250898.09 |
| Range | 250898.09 |
| Interquartile range (IQR) | 127644.24 |
Descriptive statistics
| Standard deviation | 62397.4052 |
|---|---|
| Coefficient of variation (CV) | 0.8158028335 |
| Kurtosis | -1.489411768 |
| Mean | 76485.88929 |
| Median Absolute Deviation (MAD) | 46766.79 |
| Skewness | -0.1411087109 |
| Sum | 764858892.9 |
| Variance | 3893436176 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3617 | 36.2% |
| 130170.82 | 2 | < 0.1% |
| 105473.74 | 2 | < 0.1% |
| 85304.27 | 1 | < 0.1% |
| 159397.75 | 1 | < 0.1% |
| 144238.7 | 1 | < 0.1% |
| 112262.84 | 1 | < 0.1% |
| 109106.8 | 1 | < 0.1% |
| 142147.32 | 1 | < 0.1% |
| 109109.33 | 1 | < 0.1% |
| Other values (6372) | 6372 | 63.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3617 | 36.2% |
| 3768.69 | 1 | < 0.1% |
| 12459.19 | 1 | < 0.1% |
| 14262.8 | 1 | < 0.1% |
| 16893.59 | 1 | < 0.1% |
| 23503.31 | 1 | < 0.1% |
| 24043.45 | 1 | < 0.1% |
| 27288.43 | 1 | < 0.1% |
| 27517.15 | 1 | < 0.1% |
| 27755.97 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 250898.09 | 1 | < 0.1% |
| 238387.56 | 1 | < 0.1% |
| 222267.63 | 1 | < 0.1% |
| 221532.8 | 1 | < 0.1% |
| 216109.88 | 1 | < 0.1% |
| 214346.96 | 1 | < 0.1% |
| 213146.2 | 1 | < 0.1% |
| 212778.2 | 1 | < 0.1% |
| 212696.32 | 1 | < 0.1% |
| 212692.97 | 1 | < 0.1% |
number_products
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 566.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 3 |
| 4th row | 2 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 5084 | 50.8% |
| 2 | 4590 | 45.9% |
| 3 | 266 | 2.7% |
| 4 | 60 | 0.6% |
credit_card
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 566.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 7055 | 70.6% |
| 0 | 2945 | 29.5% |
is_active
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 566.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 5151 | 51.5% |
| 0 | 4849 | 48.5% |
estimated_salary
Real number (ℝ≥0)
| Distinct | 9999 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100090.2399 |
| Minimum | 11.58 |
| Maximum | 199992.48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 11.58 |
|---|---|
| 5-th percentile | 9851.8185 |
| Q1 | 51002.11 |
| median | 100193.915 |
| Q3 | 149388.2475 |
| 95-th percentile | 190155.3755 |
| Maximum | 199992.48 |
| Range | 199980.9 |
| Interquartile range (IQR) | 98386.1375 |
Descriptive statistics
| Standard deviation | 57510.49282 |
|---|---|
| Coefficient of variation (CV) | 0.5745864221 |
| Kurtosis | -1.181518447 |
| Mean | 100090.2399 |
| Median Absolute Deviation (MAD) | 49198.15 |
| Skewness | 0.002085357662 |
| Sum | 1000902399 |
| Variance | 3307456784 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 24924.92 | 2 | < 0.1% |
| 101348.88 | 1 | < 0.1% |
| 55313.44 | 1 | < 0.1% |
| 72500.68 | 1 | < 0.1% |
| 182692.8 | 1 | < 0.1% |
| 4993.94 | 1 | < 0.1% |
| 124964.82 | 1 | < 0.1% |
| 161971.42 | 1 | < 0.1% |
| 39488.04 | 1 | < 0.1% |
| 187811.71 | 1 | < 0.1% |
| Other values (9989) | 9989 | 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.58 | 1 | < 0.1% |
| 90.07 | 1 | < 0.1% |
| 91.75 | 1 | < 0.1% |
| 96.27 | 1 | < 0.1% |
| 106.67 | 1 | < 0.1% |
| 123.07 | 1 | < 0.1% |
| 142.81 | 1 | < 0.1% |
| 143.34 | 1 | < 0.1% |
| 178.19 | 1 | < 0.1% |
| 216.27 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 199992.48 | 1 | < 0.1% |
| 199970.74 | 1 | < 0.1% |
| 199953.33 | 1 | < 0.1% |
| 199929.17 | 1 | < 0.1% |
| 199909.32 | 1 | < 0.1% |
| 199862.75 | 1 | < 0.1% |
| 199857.47 | 1 | < 0.1% |
| 199841.32 | 1 | < 0.1% |
| 199808.1 | 1 | < 0.1% |
| 199805.63 | 1 | < 0.1% |
exited
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 566.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 79.6% |
| 1 | 2037 | 20.4% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
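The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report itself runs. The function names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is an assumption about tie handling.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/(n - 1) factors cancel, so plain sums are used.
    Assumes neither input is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y is the cube of x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson_r(x, y))
```

On this toy data the ranks of x and y coincide, so ρ reaches its maximum, while Pearson's r on the raw values stays below 1 because the relationship is not linear.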
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
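The invariance property is easy to check numerically. The sketch below uses made-up toy data and a hypothetical helper `pearson_r`; it is not the report's own implementation. The scale factors are positive here, since a negative factor would flip the sign of r.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/(n - 1) factors cancel, so plain sums are used.
    Assumes neither input is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data (illustrative only): y is roughly 2 * x with small noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_original = pearson_r(x, y)

# Shift and rescale each variable separately, using positive scale factors.
x_transformed = [3.0 * a + 7.0 for a in x]
y_transformed = [0.5 * b - 2.0 for b in y]
r_transformed = pearson_r(x_transformed, y_transformed)

# Two exact lines with different slopes both give the same r.
r_shallow = pearson_r(x, [0.1 * a for a in x])
r_steep = pearson_r(x, [10.0 * a for a in x])

print(r_original, r_transformed, r_shallow, r_steep)
```

The first two values agree up to floating-point error, and both exact lines yield the same coefficient regardless of slope.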
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
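The pair-counting definition above translates directly into code. The sketch below implements that plain form (often called τ-a), under the assumption that pairs tied in either variable count as neither concordant nor discordant. Library implementations often use a tie-adjusted variant, so results may differ when ties are present. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, by direct counting."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0: tied in x or y, counted as neither (assumption)
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6. The double loop is quadratic in the number of observations, which is fine for illustration but slow for large columns.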
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | df_index | credit_rating | country | gender | age | tenure | balance | number_products | credit_card | is_active | estimated_salary | exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 619 | France | Female | 110 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 608 | Spain | Female | 38 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 502 | France | Female | 54 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 699 | France | Female | 0 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 850 | Spain | Female | 54 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
| 5 | 6 | 645 | Spain | Male | 35 | 8 | 113755.78 | 2 | 1 | 0 | 149756.71 | 1 |
| 6 | 7 | 822 | France | Male | 144 | 7 | 0.00 | 2 | 1 | 1 | 10062.80 | 0 |
| 7 | 8 | 376 | Germany | Female | 127 | 4 | 115046.74 | 4 | 1 | 0 | 119346.88 | 1 |
| 8 | 9 | 501 | France | Male | 96 | 4 | 142051.07 | 2 | 0 | 1 | 74940.50 | 0 |
| 9 | 10 | 684 | France | Male | 96 | 2 | 134603.88 | 1 | 1 | 1 | 71725.73 | 0 |
Last rows
| | df_index | credit_rating | country | gender | age | tenure | balance | number_products | credit_card | is_active | estimated_salary | exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9990 | 9991 | 714 | Germany | Male | 114 | 3 | 35016.60 | 1 | 1 | 0 | 53667.08 | 0 |
| 9991 | 9992 | 597 | France | Female | 65 | 4 | 88381.21 | 1 | 1 | 0 | 69384.71 | 1 |
| 9992 | 9993 | 726 | Spain | Male | 104 | 2 | 0.00 | 1 | 1 | 0 | 195192.40 | 0 |
| 9993 | 9994 | 644 | France | Male | 103 | 7 | 155060.41 | 1 | 1 | 0 | 29179.52 | 0 |
| 9994 | 9995 | 800 | France | Female | 44 | 2 | 0.00 | 2 | 0 | 0 | 167773.55 | 0 |
| 9995 | 9996 | 771 | France | Male | 119 | 5 | 0.00 | 2 | 1 | 0 | 96270.64 | 0 |
| 9996 | 9997 | 516 | France | Male | 83 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 | 0 |
| 9997 | 9998 | 709 | France | Female | 39 | 7 | 0.00 | 1 | 0 | 1 | 42085.58 | 1 |
| 9998 | 9999 | 772 | Germany | Male | 70 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 | 1 |
| 9999 | 10000 | 792 | France | Female | 87 | 4 | 130142.79 | 1 | 1 | 0 | 38190.78 | 0 |